About Data classification (business intelligence)

Data classification (business intelligence): English Wikipedia

In business intelligence, data classification has close ties to data clustering, but where data clustering is ''descriptive'', data classification is ''predictive''. In essence, data classification consists of using variables with known values to predict the unknown or future values of other variables. It can be used in, for example, direct marketing, insurance fraud detection, or medical diagnosis.〔Kimball, R. et al. (2008). ''The Data Warehouse Lifecycle Toolkit'' (2nd ed.). Wiley. ISBN 0-471-25547-5〕
The first step in data classification is to cluster the data set used for category training, creating the desired number of categories. An algorithm, called the ''classifier'', is then applied to the categories, creating a descriptive model for each. These models can then be used to assign new items to the resulting classification system.〔Golfarelli, M. & Rizzi, S. (2009). ''Data Warehouse Design: Modern Principles and Methodologies''. McGraw-Hill Osborne. ISBN 0-07-161039-1〕
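The three steps described above (cluster the training data, build a descriptive model per category, categorize new items) can be sketched roughly as follows. This is only an illustration under stated assumptions: the source names no specific algorithm, so plain k-means stands in for the clustering step and a nearest-mean rule stands in for the classifier. The customer data and all function names are hypothetical.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    distances = [squared_distance(point, c) for c in centers]
    return distances.index(min(distances))

def mean(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def cluster(points, k, iterations=20, seed=0):
    """Step 1: plain k-means (an assumed choice); one category label per point."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = mean(members)
    return [nearest(p, centers) for p in points]

def train(points, labels):
    """Step 2: build one descriptive model (here just a mean) per category."""
    models = {}
    for category in set(labels):
        members = [p for p, label in zip(points, labels) if label == category]
        models[category] = mean(members)
    return models

def classify(item, models):
    """Step 3: assign a new item to the category with the closest model."""
    return min(models, key=lambda c: squared_distance(item, models[c]))

# Hypothetical customers: (age, purchases per year).
customers = [(22, 3), (25, 4), (24, 2), (61, 15), (58, 17), (63, 14)]
labels = cluster(customers, k=2)
models = train(customers, labels)
new_customer_category = classify((60, 16), models)
```

Here the new customer `(60, 16)` lands in the same category as the older, more frequent buyers. In practice the per-category model would usually be richer than a single mean, and the number of categories `k` would be chosen to suit the business question.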
According to Golfarelli and Rizzi (2009), the effectiveness of a classifier is measured by:
*''Predictive accuracy'': How well does it predict the categories for new observations?
*''Speed'': What is the computational cost of using the classifier?
*''Robustness'': How well do the models created perform if data quality is low?
*''Scalability'': Does the classifier function efficiently with large amounts of data?
*''Interpretability'': Are the results understandable to users?
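Two of the measures listed above, predictive accuracy and speed, are easy to quantify on held-out data whose categories are already known. The sketch below is a minimal illustration, not a prescribed evaluation method: the `evaluate` helper, the threshold-based stand-in classifier, and the claim amounts are all hypothetical.

```python
import time

def evaluate(classifier, items, true_categories):
    """Return (accuracy, seconds per item) for a classifier on held-out data."""
    start = time.perf_counter()
    predictions = [classifier(item) for item in items]
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(predictions, true_categories))
    return correct / len(items), elapsed / len(items)

# Hypothetical stand-in classifier: flags a claim above a fixed amount.
def threshold_classifier(claim_amount):
    return "suspicious" if claim_amount > 10_000 else "normal"

# Hypothetical held-out insurance claims with known categories.
held_out_claims = [500, 12_000, 8_000, 30_000, 15_000]
known_categories = ["normal", "suspicious", "normal", "suspicious", "normal"]

accuracy, seconds_per_item = evaluate(
    threshold_classifier, held_out_claims, known_categories
)
```

The stand-in classifier gets four of the five claims right, so `accuracy` is 0.8. Robustness could be probed in the same way by re-running `evaluate` on a deliberately degraded copy of the data, and scalability by timing progressively larger inputs.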
Typical inputs for data classification include variables such as demographics, lifestyle information, and economic behaviour.
==Challenges for data classification==
There are several challenges in working with data classification. One in particular is that anyone who applies categories to, for example, customers or clients must do the modeling as an iterative process. This ensures that changes in the characteristics of customer groups do not go unnoticed and leave the existing categories outdated.
This can be especially important for insurance or banking companies, where fraud detection is highly relevant. New fraud patterns may go unnoticed unless methods are developed and implemented to monitor these changes and raise an alert when categories change, disappear, or newly emerge.
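One simple way to monitor for such changes is to compare how items are distributed across categories in two periods and flag categories that have shifted, vanished, or appeared. The following is a minimal sketch under assumptions: the category names, the counts, and the 10-percentage-point tolerance are hypothetical, and a real system would typically also trigger re-modeling when an alert fires.

```python
from collections import Counter

def category_shares(assignments):
    """Fraction of items assigned to each category."""
    counts = Counter(assignments)
    total = len(assignments)
    return {category: n / total for category, n in counts.items()}

def drift_alerts(previous, current, tolerance=0.10):
    """Flag categories that emerged, disappeared, or changed share by more than tolerance."""
    before = category_shares(previous)
    after = category_shares(current)
    alerts = []
    for category in sorted(set(before) | set(after)):
        old = before.get(category, 0.0)
        new = after.get(category, 0.0)
        if old == 0.0:
            alerts.append((category, "emerged"))
        elif new == 0.0:
            alerts.append((category, "disappeared"))
        elif abs(new - old) > tolerance:
            alerts.append((category, "changed"))
    return alerts

# Hypothetical category assignments for two consecutive quarters.
last_quarter = ["low_risk"] * 70 + ["medium_risk"] * 25 + ["known_fraud"] * 5
this_quarter = ["low_risk"] * 50 + ["medium_risk"] * 28 + ["new_pattern"] * 22

alerts = drift_alerts(last_quarter, this_quarter)
```

With these made-up numbers, `alerts` reports that `known_fraud` disappeared, `low_risk` changed, and `new_pattern` emerged, while the small shift in `medium_risk` stays within tolerance.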

Excerpt source: Wikipedia, the free encyclopedia